References to the Handbook and to units are given (where appropriate). 
A reference such as ‘Handbook, B1:6’, for example, relates to point 6 in the 
unit outline for Unit B1. 


PART 1 


In many cases below, the working that leads to the correct option(s) is 
included. Please note that you will not need to provide any such working to 
answer Part 1 questions in the real examination. 


Question 1 (Unit A1, Subsection 4.2) 
Correct options: C and F (1 mark each). 
The lower and upper quartiles are (Handbook, A1:5) 
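\[ q_L = x_{(\frac{1}{4}(8+1))} = x_{(2\frac{1}{4})} = x_{(2)} + \tfrac{1}{4}\,(x_{(3)} - x_{(2)}) = 1.07 + \tfrac{1}{4}(1.11 - 1.07) = 1.08 \]
and
\[ q_U = x_{(\frac{3}{4}(8+1))} = x_{(6\frac{3}{4})} = x_{(6)} + \tfrac{3}{4}\,(x_{(7)} - x_{(6)}) = 1.14 + \tfrac{3}{4}(1.26 - 1.14) = 1.23. \]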
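
The interpolation used for the quartiles can be sketched in Python. This is only an illustration: the function name `interpolated_quartile` is made up for this note, and the sample is hypothetical apart from the four order statistics quoted in the working (1.07, 1.11, 1.14 and 1.26 in positions 2, 3, 6 and 7); the other four values are placeholders chosen to keep the list sorted.

```python
import math

def interpolated_quartile(sorted_values, alpha):
    """Quantile at position alpha*(n+1), interpolating between order statistics.

    Assumes the position lies between 1 and n, as it does for the quartiles here.
    """
    n = len(sorted_values)
    position = alpha * (n + 1)        # e.g. 0.25 * 9 = 2.25
    k = math.floor(position)          # 1-based index of the lower order statistic
    fraction = position - k           # fractional part used for interpolation
    lower = sorted_values[k - 1]      # x_(k)
    if fraction == 0:
        return lower
    upper = sorted_values[k]          # x_(k+1)
    return lower + fraction * (upper - lower)

# Hypothetical sample of size 8; only positions 2, 3, 6 and 7 come from the solution.
sample = [1.02, 1.07, 1.11, 1.12, 1.13, 1.14, 1.26, 1.30]
q_lower = interpolated_quartile(sample, 0.25)
q_upper = interpolated_quartile(sample, 0.75)
print(round(q_lower, 2), round(q_upper, 2))
```

Only the two neighbouring order statistics affect each quartile, so the placeholder values do not change the result.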


Question 2 (Unit A2, Subsection 1.1) 
Correct option: B (2 marks). 
For these data, 
qU + (1.5 x iqr) = 39.5 + (1.5 x 13.5) = 59.75. 
The highest observation not exceeding 59.75 is 56. 


So the upper adjacent value is 56. 


Question 3 (Unit A1, Subsections 4.1–4.3) 
Correct options: A, F and H (1 mark each). 


Question 4 (Unit D3, Subsection 3.1) 
Correct option: E (1 mark). 


Proportion of accidents attributable to unsafe working conditions 
\[ = \frac{\text{Total number of ‘Yes’}}{\text{Total number of accidents}} = \frac{46}{200} = 0.23. \]


Question 5 (Unit D3, Subsection 3.1) 
Correct option: A (1 mark). 


Proportion of accidents that occurred on the night shift 
\[ = \frac{\text{Total number of ‘Night’}}{\text{Total number of accidents}} = \frac{54}{200} = 0.27. \]


Question 6 (Unit D3, Subsection 3.1) 

Correct option: C (1 mark). 

P(accident on night shift was due to unsafe working conditions) 
\[ = \frac{\text{Number of ‘Yes’ on the night shift}}{\text{Total number on the night shift}} = \frac{10}{54} \approx 0.185. \]


Question 7 (Unit D3, Subsection 3.1) 
Correct option: F (1 mark). 


P(accident that was attributable to unsafe working conditions occurred on 
the day shift) 
\[ = \frac{\text{Number of ‘Yes’ on the day shift}}{\text{Total number of ‘Yes’}} = \frac{20}{46} \approx 0.435. \]
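
The four proportions in Questions 4 to 7 differ only in which total is used as the denominator. A short Python check, using just the counts quoted in those solutions (the variable names are illustrative), makes the distinction between marginal and conditional proportions explicit.

```python
# Counts quoted in the solutions to Questions 4 to 7.
total_accidents = 200
total_yes = 46        # accidents attributable to unsafe working conditions
total_night = 54      # accidents on the night shift
yes_on_night = 10
yes_on_day = 20

p_yes = total_yes / total_accidents              # Question 4: marginal proportion
p_night = total_night / total_accidents          # Question 5: marginal proportion
p_yes_given_night = yes_on_night / total_night   # Question 6: conditional on night shift
p_day_given_yes = yes_on_day / total_yes         # Question 7: conditional on 'Yes'

print(round(p_yes, 3), round(p_night, 3),
      round(p_yes_given_night, 3), round(p_day_given_yes, 3))
```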


Question 8 (Unit C3, Subsection 1.2) 
Correct option: B (1 mark). 


There are four wheel bearings, which can be either damaged or not 
damaged, and the number of these four that are damaged, X, is of 
interest. As there are two possible outcomes for each wheel bearing, being 
damaged can be considered as a ‘success’ so that a binomial distribution 
(with n = 4) would be suitable for X. 


Question 9 (Unit C3, Subsection 1.3) 
Correct option: F (1 mark). 


Y is a continuous variable. It is likely that the lengths of antennae would 
be clustered around a central value (which would suggest a normal 
distribution rather than a uniform distribution), and that this central 
value would be greater than 0 (which would suggest a normal distribution 
rather than an exponential distribution, which has a mode at 0). 
Therefore a normal distribution would be suitable for Y. 


Question 10 (Unit C3, Subsection 1.2) 
Correct option: D (1 mark). 


Traffic accidents are occurring at random, and Z is the number of traffic 
accidents occurring during a fixed interval of time, namely each day, so a 
Poisson distribution would be suitable for Z. 


Question 11 (Unit C3, Subsection 1.3) 
Correct option: E (1 mark). 


U is a (continuous) waiting time. It seems reasonable to assume that the 
arrivals arise at random, so an exponential distribution would be suitable 
for U. 


Question 12 (Unit A4, Subsection 2.1) 
Correct option: D (2 marks). 
The mean is given by (Handbook, A4:3) 
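\[ E(Y) = \sum_y y\,p(y) = (0 \times p(0)) + (1 \times p(1)) + (2 \times p(2)) + (3 \times p(3)), \]
where p(y) is the probability mass function of Y given in the question. 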


Question 13 (Unit A4, Subsection 3.1; Unit B1, Subsection 3.1) 
Correct option: B (2 marks). 
The variance is given by (Handbook, A4:3) 


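\[ V(X) = \sum_x (x - \mu)^2\, p(x). \]


Alternatively (Handbook, B1:8), 
\[ V(X) = E(X^2) - \mu^2 = E(X^2) - \left(\tfrac{1}{6}\right)^2 = (-1)^2 \times \tfrac{1}{3} + 0^2 \times \tfrac{1}{6} + 1^2 \times \tfrac{1}{2} - \tfrac{1}{36} = \tfrac{29}{36}. \]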
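
The two routes to the variance can be compared with exact fractions. The sketch below assumes the probability mass function p(−1) = 1/3, p(0) = 1/6, p(1) = 1/2; this is an assumption inferred from the working (it gives mean 1/6 and variance 29/36), not a distribution restated in this solution.

```python
from fractions import Fraction

# Assumed probability mass function (inferred, see the note above).
pmf = {-1: Fraction(1, 3), 0: Fraction(1, 6), 1: Fraction(1, 2)}
assert sum(pmf.values()) == 1

mean = sum(x * p for x, p in pmf.items())                        # E(X)
second_moment = sum(x**2 * p for x, p in pmf.items())            # E(X^2)
var_shortcut = second_moment - mean**2                           # E(X^2) - mu^2
var_definition = sum((x - mean)**2 * p for x, p in pmf.items())  # sum of (x - mu)^2 p(x)

print(mean, var_shortcut, var_definition)
```

Both formulas return the same exact fraction, which is the point of the alternative method.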


Question 14 (Unit B2, Subsection 3.2) 
Correct options: D and E (1 mark each). 


The number of defective fuses in a batch of 1000 has the binomial 
B(1000, 0.02) distribution. So by the central limit theorem (Handbook, 
B2:6), the normal distribution that can be used to calculate the 
approximate value for the probability of observing 27 or more defective 
fuses in a random sample of 1000 fuses has mean and variance given by 


mean = 1000 x 0.02 = 20, 
variance = 1000 x 0.02 x 0.98 = 19.6. 
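
As a rough check, the mean and variance can be computed in Python and used to approximate the probability of 27 or more defective fuses. The continuity correction (using 26.5) is an assumption of this sketch; the solution above only requires the mean and variance. The helper `normal_cdf` is written here from `math.erf` and is not part of any course software.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 1000, 0.02
mean = n * p                  # mean of B(1000, 0.02)
variance = n * p * (1 - p)    # variance of B(1000, 0.02)
sd = math.sqrt(variance)

# Approximate P(X >= 27); the 26.5 reflects an assumed continuity correction.
approx_prob = 1.0 - normal_cdf((26.5 - mean) / sd)
print(mean, round(variance, 1), round(approx_prob, 4))
```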


Question 15 (Unit A5, Section 1) 
Correct option: A (2 marks). 
Let X be the number of calls arriving per hour. Then X ~ Poisson(4), so 
(Handbook, A5:1) 
\[ P(X = 2) = \frac{e^{-4}\,4^2}{2!} \approx 0.147. \]


Question 16 (Unit A5, Section 1) 
Correct option: E (2 marks). 
For X ~ Poisson(4), the required probability is (Handbook, A5:1) 
\[ P(X \ge 2) = 1 - P(X = 0) - P(X = 1) = 1 - \frac{e^{-4}\,4^0}{0!} - \frac{e^{-4}\,4^1}{1!} = 1 - 5e^{-4} \approx 0.908. \]
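
The Poisson probabilities in Questions 15 and 16 can be checked numerically. This is a minimal sketch using only the standard library; `poisson_pmf` is a helper defined here for illustration.

```python
import math

def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**k / math.factorial(k)

rate = 4  # calls per hour

p_exactly_two = poisson_pmf(2, rate)                               # Question 15
p_at_least_two = 1 - poisson_pmf(0, rate) - poisson_pmf(1, rate)   # Question 16

print(round(p_exactly_two, 3), round(p_at_least_two, 3))
```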


Question 17 (Unit A5, Section 3) 
Correct option: C (2 marks). 


The interval in hours between successive calls, T say, has an exponential 
distribution with parameter λ = 4 (Handbook, A5:6). Ten minutes is 1/6 of 
60 minutes, so 
\[ P(T \le 10 \text{ minutes}) = P\left(T \le \tfrac{1}{6}\right) = 1 - e^{-4 \times \frac{1}{6}} \approx 0.487. \]
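
The same waiting-time probability can be evaluated directly from the exponential distribution function F(t) = 1 − exp(−λt). The function name below is illustrative.

```python
import math

def exponential_cdf(t, rate):
    """P(T <= t) for an exponential waiting time with the given rate."""
    return 1.0 - math.exp(-rate * t)

rate_per_hour = 4
ten_minutes_in_hours = 10 / 60

p_within_ten_minutes = exponential_cdf(ten_minutes_in_hours, rate_per_hour)
print(round(p_within_ten_minutes, 3))
```

The time must be expressed in hours because the rate is per hour.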


Question 18 (Unit A5, Section 3) 
Correct option: F (2 marks). 


There are two hours between 9.00 am and 11.00 am. So the number of calls 
arriving in two hours has a Poisson distribution with parameter 
(Handbook, A5:6) 


(4 per hour) x 2 = 8. 


Question 19 (Unit A3, Section 5) 
Correct option: F (2 marks). 
The probability required is (Handbook, A3:12) 
\[ P(Z < 5) = P(Z \le 4) = F(4) = 1 - q^4 = 1 - (0.3)^4 = 0.9919. \]


Question 20 (Unit B2, Subsection 2.4) 
Correct option: A (2 marks). 


The proportion of 100-tile packs that weigh more than 10 kilograms is 
given by (Handbook, B1:5) 
\[ P(Y > 10) \approx P\left(Z > \frac{10 - 9.9}{0.05}\right) = 1 - \Phi(2) = 1 - 0.9772 = 0.0228. \]


Question 21 (Unit B2, Subsection 2.4) 
Correct option: A (2 marks). 


Let $T_{100}$ be the total weight of 100 passengers. Then $T_{100}$ is approximately 
normal with (Handbook, B2:4) 
\[ E(T_{100}) = 100 \times 70 = 7000 \quad\text{and}\quad V(T_{100}) = 100 \times 64 = 6400. \]
Then 
\[ P(T_{100} < 7200) \approx P\left(Z < \frac{7200 - 7000}{\sqrt{6400}}\right) = P(Z < 2.5) = 0.9938 \approx 0.994. \]
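
The table look-ups in Questions 20 and 21 can be reproduced with the standard normal distribution function written in terms of `math.erf`. This is a sketch for checking the arithmetic; `normal_cdf` is a helper defined here, not a Handbook function.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Question 21: total weight of 100 passengers.
mean_total = 100 * 70
var_total = 100 * 64
z = (7200 - mean_total) / math.sqrt(var_total)
p_below = normal_cdf(z)

# Question 20: upper-tail probability for a standardised value of 2.
p_above = 1.0 - normal_cdf(2.0)

print(z, round(p_below, 4), round(p_above, 4))
```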


Question 22 (Unit A5, Subsection 2.2) 
Correct option: B (2 marks). 
X ~ M(2), so (Handbook, 4: Table of continuous probability distributions) 
\[ V(X) = \frac{1}{\lambda^2} = \frac{1}{2^2} = \frac{1}{4}. \]
Thus 
\[ V(W) = V(4X) = 16 \times V(X) = \frac{16}{4} = 4. \]


Question 23 (Unit C2, Subsection 2.2) 

Correct option: B (2 marks). 

$X \sim \chi^2(5)$, so (Handbook, 4: Table of continuous probability distributions) 
V(X) = 2 x 5 = 10. 

Also, $Y \sim \chi^2(3)$, so 
V(Y) = 2 x 3 = 6. 

Then (Handbook, B1:11) 
\[ SD(U) = \sqrt{V(U)} = \sqrt{V(X) + V(Y)} = \sqrt{10 + 6} = \sqrt{16} = 4. \]


Question 24 (Unit B3, Subsection 3.1) 
Correct option: B (2 marks). 


If μ denotes the mean birth weight in pounds, and θ denotes the mean 
birth weight in grams, then θ = 454 x μ. 


This is an increasing function, so the corresponding 95% confidence 
interval for θ is (Handbook, B3:3) 


(454 x 6.87, 454 x 7.45) = (3118.98, 3382.30). 


Question 25 (Unit B3, Subsection 3.2) 
Correct option: E (2 marks). 


An estimate of p is $\hat{p}$ = 6/3000 = 0.002. The upper limit of an 
approximate 95% confidence interval for p is (Handbook, B3:4) 
\[ \hat{p} + z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.002 + 1.96 \times \sqrt{\frac{0.002 \times 0.998}{3000}} \approx 0.0036. \]
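
The limit can be verified with a few lines of Python. The lower limit is also computed, purely for illustration; only the upper limit is asked for in the question.

```python
import math

successes, n = 6, 3000
z = 1.96  # 0.975-quantile of the standard normal distribution

p_hat = successes / n
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)

upper_limit = p_hat + z * standard_error
lower_limit = p_hat - z * standard_error   # not required by the question

print(round(upper_limit, 4))
```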


Question 26 (Unit B3, Subsection 5.1) 
Correct option: C (2 marks). 
A confidence interval for a normal mean is a t-interval (Handbook, B3:7). 


For a 90% t-interval and sample of size 15, the 0.95-quantile of 
t(n − 1) = t(14) is needed; this is 1.761. The lower limit of the t-interval is 
then 
\[ \bar{x} - t \times \frac{s}{\sqrt{n}} = 12.5 - 1.761 \times \frac{4.4}{\sqrt{15}} \approx 10.5. \]
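
A minimal numerical check of the t-interval follows. The quantile 1.761 is taken from the solution rather than computed, since the standard library has no t-distribution; the upper limit is included only for completeness.

```python
import math

n = 15
sample_mean = 12.5
sample_sd = 4.4
t_quantile = 1.761   # 0.95-quantile of t(14), as quoted in the solution

half_width = t_quantile * sample_sd / math.sqrt(n)
lower = sample_mean - half_width
upper = sample_mean + half_width   # not required by the question

print(round(lower, 1))
```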


Question 27 (Unit C1, Subsection 6.2) 
Correct option: A (2 marks). 
The required sample size is given by (Handbook, C1:12) 


\[ n = \frac{\sigma^2}{d^2}\,(q_{1-\alpha/2} - q_{1-\gamma})^2, \]
where σ = 5, d = 2, $q_{1-\alpha/2} = q_{0.975} = 1.960$ and $q_{1-\gamma} = q_{0.2} = -0.8416$. So 
\[ n = \frac{5^2}{2^2}\,(1.960 + 0.8416)^2 \approx 49.06, \]
which is rounded up to 50. 
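
The sample size calculation, including the rounding up, can be sketched as follows. The function name and argument names are illustrative; the two quantiles are the values quoted in the solution.

```python
import math

def required_sample_size(sigma, d, q_upper, q_lower):
    """n = (sigma^2 / d^2) * (q_upper - q_lower)^2, before rounding."""
    return (sigma**2 / d**2) * (q_upper - q_lower) ** 2

n_exact = required_sample_size(sigma=5, d=2, q_upper=1.960, q_lower=-0.8416)
n_required = math.ceil(n_exact)   # always round up to guarantee the power

print(round(n_exact, 2), n_required)
```

Rounding to the nearest integer would give 49, which falls just short of the requirement, so `math.ceil` is used.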


Question 28 (Unit D2, Subsection 2.3) 
Correct options: F and H (1 mark each). 
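$S_{xx}$ and $S_{xy}$ are calculated as follows (Handbook, D2:4): 
\[ S_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} = 127 - \frac{26^2}{8} = 42.5, \]
\[ S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}. \]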


Question 29 (Unit D2, Subsection 2.3) 
Correct option: E (1 mark). 
The least squares estimate of the slope parameter β for the regression line 
is (Handbook, D2:5) 
\[ \hat{\beta} = \frac{S_{xy}}{S_{xx}} \approx 0.628 \approx 0.63. \]


PART 2 
In Part 2 it is very important that you show ALL of your working. 


Question 30 (Unit A2, Section 1) 
Two or more of the following comments will gain full marks. 


Boxplots are much better for comparing several groups at the same time; 
boxplots display certain numerical summaries (median, quartiles), whereas 
histograms do not; boxplots give precise locations of outliers, whereas 
histograms display them only approximately; there are no cutpoint or 
interval width specifications for boxplots (and choosing these can be 
problematic for histograms). [2] 


Question 31 (Unit A1, Subsection 3.1) 


The cutpoints between groups are different in the two histograms. [1] 


Question 32 (Unit A3, Subsection 3.1; Handbook, A3:5) 


The function is not a probability mass function because p(5) is negative 
(p(5) = −5). [1] 


Question 33 (Unit B3, Subsections 2.1 and 2.2; Handbook, B3:1) 


(a) If a large number of random samples of men of size 529 were drawn 
independently from the population of men, and on each occasion a 
90% confidence interval were calculated for the mean number of units 
consumed in the previous week, then approximately 90% of these 
intervals would contain the true mean number of units consumed by 
men in the previous week. [1] 


(b) This confidence interval defines a plausible range for the mean number 
of units consumed in the previous week. 


For example, if the true mean number of units consumed were greater 
than 20.7, then the probability of observing a sample mean less than 
or equal to 18.2 would be less than 0.05. 


(Or, if the true mean number of units consumed were less than 15.7, 
then the probability of observing a sample mean greater than or equal 
to 18.2 would be less than 0.05.) [2] 


Question 34 (Unit C1, Sections 1 and 2) 


(a) The null hypothesis would be rejected at the 5% significance level. 


These data provide evidence that the population mean is not equal 
to 3. [2] 


(b) The p value in the first analysis must be less than 0.05. So either the 
first analysis must be incorrect, or the p value of 0.15 in the second 
analysis must be wrong (or both). [1] 


Question 35 (Unit A2, Section 1; Unit C2, Subsection 1.3) 
(a) The samples are right-skew. 


(b) The median is higher for the sample from population A, but the 
spread of values in the two samples is similar (as measured by the 
interquartile range). (You could have mentioned that the spread as 
measured by the range is greater for the sample from population B.) 


(c) (Handbook, C1:10) Since the samples are right-skew, it cannot be 
assumed that the populations are normally distributed. 


(d) (Handbook, C2:4) 


(i) The null hypothesis is that the distributions of the populations 
from which the samples are drawn are the same. 


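(ii) Under the null hypothesis, 
\[ U_A \approx N\left(\frac{n_A(n_A + n_B + 1)}{2},\ \frac{n_A n_B (n_A + n_B + 1)}{12}\right) = N(510, 2550). \]
The p value is 
\[ 2 \times P(U_A \ge 646) \approx 2 \times P\left(Z \ge \frac{646 - 510}{\sqrt{2550}}\right) \approx 2 \times (1 - \Phi(2.69)) = 0.0072. \]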


(iii) There is strong evidence that the populations from which the two 
samples are drawn are not the same. 
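
The normal approximation in part (d)(ii) can be checked numerically. The sample sizes 20 and 30 used below are an assumption: they are the values consistent with the stated null distribution N(510, 2550) and are not quoted in this solution. `normal_cdf` is a helper built from `math.erf`.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n_a, n_b = 20, 30      # assumed sample sizes, consistent with N(510, 2550)
u_observed = 646

mean_u = n_a * (n_a + n_b + 1) / 2           # mean of U_A under the null hypothesis
var_u = n_a * n_b * (n_a + n_b + 1) / 12     # variance of U_A under the null hypothesis

z = (u_observed - mean_u) / math.sqrt(var_u)
p_value = 2 * (1 - normal_cdf(z))            # two-sided p value

print(mean_u, var_u, round(z, 2))
```

Because z is not rounded to two decimal places before the look-up, the computed p value differs slightly from the table-based 0.0072.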


Question 36 (Unit C1, Subsection 4.2; Handbook, C1:10) 
The hypotheses are 


\[ H_0: \mu_A = \mu_B, \qquad H_1: \mu_A \ne \mu_B. \]
The null distribution of the test statistic is $t(n_A + n_B - 2) = t(8)$. 


The observed value of the test statistic is 
\[ t = \frac{\bar{x}_A - \bar{x}_B}{s_P \sqrt{\frac{1}{n_A} + \frac{1}{n_B}}} \approx -1.312, \]
where $\bar{x}_A - \bar{x}_B = 1.92 - 3.12$ and $s_P$ is the pooled sample standard deviation. 


The 0.9-quantile of t(8) is 1.397, so the p value must be greater than 
2 x 0.1 = 0.2. Hence there is insufficient evidence to reject the null 
hypothesis that the drugs produce the same mean effects. 


Question 37 (Unit C3, Section 4; Handbook, C3:3) 


Statement | Section 
1         | C: Results 
2         | B: Methods 
3         | C: Results 
4         | B: Methods 
5         | A: Introduction 
6         | D: Discussion 


Question 38 (Unit D1, Subsection 3.2; Handbook, D1:8) 
(a) The MLE of θ is 8.29, the maximum value. 


(b) The maximum value in a sample is a biased estimator of θ 
(it underestimates θ). 


Question 39 (Unit D1, Subsection 3.1; Handbook, D1:4) 
The likelihood of p for the sample is given by 
\[ L(p) = (1 - p)^4 p \times p \times (1 - p)^4 p = (1 - p)^8 p^3. \]


Question 40 (Unit D2, Subsection 4.1) 


(a) A 1000 euro increase in net disposable income corresponds to 21.65 
extra television sets per 1000 people (that is, $\hat{\beta}$). 


(b) (Handbook, D2:8) The distribution of $\hat{\beta}$ is given by 
\[ \frac{\hat{\beta} - \beta}{S / \sqrt{S_{xx}}} \sim t(n - 2). \]
For a sample of size 11 and a 95% confidence interval, the 
0.975-quantile of t(9) is required; this is 2.262. So the confidence 
interval is 
\[ \left(\hat{\beta} \pm 2.262 \times s / \sqrt{S_{xx}}\right) = \left(21.65 \pm 2.262 \times \sqrt{1072.0 / 82.01}\right) \approx (13.47, 29.83). \]
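
The interval for the slope can be checked numerically. The sketch reads 1072.0 as the estimate of the error variance s² and 82.01 as S_xx, which is the reading that reproduces the quoted limits; the t-quantile is taken from the solution rather than computed.

```python
import math

beta_hat = 21.65
s_squared = 1072.0    # read as the estimated error variance
s_xx = 82.01
t_quantile = 2.262    # 0.975-quantile of t(9), as quoted in the solution

half_width = t_quantile * math.sqrt(s_squared / s_xx)
interval = (beta_hat - half_width, beta_hat + half_width)

print(round(interval[0], 2), round(interval[1], 2))
```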








Question 41 (Unit D2, Section 3) 


Two plots should be obtained: a normal probability plot of the residuals, 
and a residual plot of residuals against fitted values. 


The points on the normal probability plot should lie roughly along a 
straight line. 


The points on the residual plot should appear to be randomly located, and 
the variance should not appear to vary with the fitted values. 


[3] 


Question 42 (Unit D3, Section 2; Handbook, D3:3,4) 


(a) Any scatterplot showing the points lying on a non-linear, monotonic 
increasing curve would obtain the mark. [1] 


(b) If the Pearson correlation is 1, then the points lie on a straight line 
with positive slope, so the Spearman correlation must also be 1. [1] 


Question 43 (Unit D3, Section 3) 


(a) (Handbook, D3:8) The null hypothesis is that there is no association 
between experiencing jet lag and being given the treatment. [1] 


(b) The expected frequencies for the cells are as follows. 


            Jet lag   No jet lag 
Treatment     6.5        8.5 
Placebo       6.5        8.5 


Hence the observed value of the test statistic is 
\[ \chi^2 = \frac{(3 - 6.5)^2}{6.5} + \frac{(10 - 6.5)^2}{6.5} + \frac{(12 - 8.5)^2}{8.5} + \frac{(5 - 8.5)^2}{8.5} = \frac{12.25}{6.5} + \frac{12.25}{6.5} + \frac{12.25}{8.5} + \frac{12.25}{8.5} \approx 6.65. \] [2] 


There is (2 − 1)(2 − 1) = 1 degree of freedom. The 0.99-quantile of 
$\chi^2(1)$ is 6.63, and the 0.995-quantile is 7.88. So the p value is between 
0.005 and 0.01. [2] 


There is strong evidence against the null hypothesis that there is no 
association between experiencing jet lag and being given the 
treatment. The data suggest that the treatment suppresses jet lag. [2] 
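
The expected frequencies and the test statistic can be reproduced from the observed table. The observed counts used below (3 and 12 for the treatment group, 10 and 5 for the placebo group) are read from the working for the test statistic, taking the first count as 3 since that is the value consistent with the squared difference 12.25.

```python
# Observed counts: rows are treatment and placebo, columns are jet lag and no jet lag.
observed = [[3, 12],
            [10, 5]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = row total x column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum of (O - E)^2 / E over all four cells.
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(expected, round(chi_squared, 2))
```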